Minimax Subset Selection for Loss Measured by Subset Size

1.  INTRODUCTION.  A subset selection problem may be formulated as a multiple decision problem. The distinguishing feature of a subset selection problem is the goal of determining in which of k partition sets of the parameter space the true parameter lies. In subset selection problems, attention is usually restricted to rules which ensure a certain minimum probability, $P^*$, of making a correct decision. Restricting attention to these rules, minimaxity is investigated for loss measured by subset size and by number of non-best populations selected. The minimax values are found to be $kP^*$ and $(k-1)P^*$, respectively, under general conditions involving only the topological structure of the parameter space and the continuity of certain functions of the parameter. These results include problems involving nuisance parameters and (possibly unequal) sample sizes greater than one. Using these results, rules proposed by Gupta (1965) are found to be minimax in location and scale parameter problems when the populations are independent and the densities have monotone likelihood ratio. Other rules, proposed for selection in terms of binomial and multinomial probabilities and multivariate non-centrality parameters, are shown to be not minimax. A class of rules, proposed by Seal (1955) for the location parameter problem, is also investigated. For certain values of k and $P^*$, rules in this class are shown to be not minimax.

2.  MULTIPLE  DECISION  THEORY  FORMULATION.  A subset  selection  problem  may  be 
formulated  as  a multiple  decision  theory  problem.  The  specific  choice  of  the 
action  space  sets  the  subset  selection  problem  apart  from  other  multiple 
decision  theory  problems. 
The sample space $\mathcal{X}$ is a subset of q-dimensional Euclidean space $\mathbb{R}^q$. The parameter space $\Theta$ is a subset of $\mathbb{R}^r$. The observation $\underline{X} = (X_1, \ldots, X_q)$ is a random vector with cumulative distribution function (c.d.f.) $F(\underline{x}; \underline{\theta})$. It is assumed that there exists a partition of $\Theta$ denoted by $\{\Theta_i : i = 1, \ldots, k\}$ ($k \ge 2$). Often this partition is determined by the largest or smallest coordinate of (some subset of) the parameter. If a particular parameter point could be placed in more than one set of this partition, e.g., two coordinates of the parameter are tied as largest, then the point is arbitrarily put in one of the sets. This is done so the partition is well defined and, in some problems, this ensures the continuity of the risk functions. The general goal of a subset selection problem is to determine, based on the observation, which of the k partition sets contains the true parameter. The action space $\mathcal{A}$ consists of the $2^k - 1$ non-empty subsets of $\{\pi_1, \ldots, \pi_k\}$, where $\pi_i$ is the statement "$\underline{\theta} \in \Theta_i$". So the action $\{\pi_{i_1}, \ldots, \pi_{i_m}\}$ corresponds to the decision "$\underline{\theta} \in \Theta_{i_1} \cup \cdots \cup \Theta_{i_m}$". The $\pi_i$'s correspond to what have been called populations in the earlier subset selection literature. In this terminology, for a given $\underline{\theta}$, the "best population" is the one true $\pi_i$ and the other (k-1) $\pi_i$'s are the "non-best populations." So a statement like, "the best population is the one associated with the largest parameter value," means $\Theta_i = \{\underline{\theta} : \theta_i = \max_{1 \le j \le k} \theta_j\}$ (with the exception that if $\theta_i$ is tied with other $\theta_j$'s as the largest, that parameter point may not be in $\Theta_i$).

By not assuming equality of k, q and r, this formulation covers problems involving nuisance parameters and (possibly unequal) sample sizes greater than one. The $\sigma$-field associated with each of the sets $\mathcal{X}$, $\Theta$ and $\mathcal{A}$ will be the discrete $\sigma$-field if the set is countable and the Borel $\sigma$-field if the set is uncountable.

A measurable function $\delta : \mathcal{X} \times \mathcal{A} \to [0,1]$ is called a selection rule provided that, for each $\underline{x} \in \mathcal{X}$, $\sum_{a \in \mathcal{A}} \delta(\underline{x}, a) = 1$. $\delta(\underline{x}, a)$ is the probability of selecting subset a having observed $\underline{x}$. The k functions defined by $\varphi_i(\underline{x}) = \sum_{\{a : \pi_i \in a\}} \delta(\underline{x}, a)$ are the individual selection probabilities; $\varphi_i(\underline{x})$ is the probability of including $\pi_i$ in the selected subset having observed $\underline{x}$. A selection rule is not, in general, completely determined by its individual selection probabilities (see Nagel (1970), Example 1.2.1). But the risk of any rule, for losses defined in terms of the quantities (2.3), can be computed in terms of the individual selection probabilities. For this reason, any two rules which have the same individual selection probabilities shall be considered equivalent.
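The relation between a selection rule and its individual selection probabilities can be illustrated with a small sketch. The two rules below are hypothetical (they are not taken from Nagel's example) and are given for a single fixed observation with k = 3. They assign different probabilities to subsets, yet their individual selection probabilities agree, and so does the expected subset size.

```python
from itertools import combinations

def nonempty_subsets(k):
    """All 2^k - 1 non-empty subsets of {1, ..., k}, as frozensets."""
    pops = range(1, k + 1)
    return [frozenset(c) for m in range(1, k + 1) for c in combinations(pops, m)]

def individual_probs(delta, k):
    """phi_i = sum of delta(a) over the subsets a that contain population i."""
    return [sum(p for a, p in delta.items() if i in a) for i in range(1, k + 1)]

def expected_size(delta):
    """Expected number of selected populations under delta."""
    return sum(len(a) * p for a, p in delta.items())

k = 3
actions = nonempty_subsets(k)

# Two hypothetical rules at one fixed observation x (illustrative values only).
delta_a = {frozenset({1, 2}): 0.5, frozenset({3}): 0.5}
delta_b = {frozenset({1, 3}): 0.5, frozenset({2}): 0.5}

phi_a = individual_probs(delta_a, k)
phi_b = individual_probs(delta_b, k)
print(len(actions), phi_a, phi_b, expected_size(delta_a), expected_size(delta_b))
```

Because the expected subset size is the sum of the individual selection probabilities, the two rules have the same value of that criterion at this observation, which is why they can be treated as equivalent.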

The selection of any subset which contains the best population is called a correct selection, denoted by CS. Let $P^*$ be any pre-assigned fixed number such that $1/k < P^* < 1$. It has been traditional in the literature to consider only selection rules which satisfy the $P^*$-condition, viz.,

(2.1)  $\inf_{\Theta} P_{\underline{\theta}}(CS \mid \varphi) \ge P^*$.

This is obviously equivalent to the following k inequalities being satisfied,

(2.2)  $\inf_{\Theta_i} E_{\underline{\theta}}\, \varphi_i(\underline{X}) = \inf_{\Theta_i} P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi) \ge P^*$,  $i = 1, \ldots, k$.

The set of all selection rules which satisfy the $P^*$-condition is denoted by $\mathcal{D}_{P^*}$.

Having ensured a high probability of correct selection through the $P^*$-condition, one would prefer a rule which selects small subsets, that is, a rule which rejects non-best populations effectively. To reflect this, the loss in a subset selection problem might be measured in several ways. The criteria used in this paper are the following.


(2.3)  i)  Number of populations selected (S)
       ii)  Number of non-best populations selected (S').

So the risk of a selection rule, $R(\underline{\theta}, \varphi)$, is given by i) the expected subset size, $E_{\underline{\theta}}(S \mid \varphi)$, or ii) the expected number of non-best populations selected, $E_{\underline{\theta}}(S' \mid \varphi)$.
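For reference, the two risks can be written out in terms of the individual selection probabilities. This is a short worked identity rather than a new result: S is a sum of k indicator variables (one per population), and S' omits the indicator of the best population. The index j below is notation introduced here for the best population at the given parameter point.

```latex
\begin{align*}
E_{\underline{\theta}}(S \mid \varphi)
  &= \sum_{i=1}^{k} E_{\underline{\theta}}\,\varphi_i(\underline{X})
   = \sum_{i=1}^{k} P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi), \\
E_{\underline{\theta}}(S' \mid \varphi)
  &= \sum_{i \neq j} P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi)
   = E_{\underline{\theta}}(S \mid \varphi) - P_{\underline{\theta}}(CS \mid \varphi),
  \qquad \underline{\theta} \in \Theta_j .
\end{align*}
```

The second line uses the fact that a correct selection is exactly the event that the best population is included in the selected subset.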

3.  MINIMAX VALUES FOR LOSSES S AND S'.  A selection rule $\varphi \in \mathcal{D}_{P^*}$ is minimax with respect to S if

(3.1)  $\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi) = \inf_{\varphi' \in \mathcal{D}_{P^*}} \sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi')$.

The value on the right side of (3.1) is called the minimax value with respect to S of the selection problem. Minimaxity with respect to S' is defined by replacing S with S' in (3.1).

Schaafsma (1969) considered minimaxity in multiple decision problems in a very general setting. But he did not restrict attention to rules which satisfy the $P^*$-condition. In this unrestricted problem he found that a minimax rule (with respect to S or S') never selects a subset consisting of more than one population. This will certainly not be the case in the restricted minimaxity of (3.1).

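The following subset of the parameter space will be of interest in finding the minimax values. Let $\Theta_0 = \{\underline{\theta} \in \Theta : \underline{\theta} \in \bar{\Theta}_i \text{ for all } i = 1, \ldots, k\}$, where $\bar{A}$ denotes the closure of A.

Theorem 3.1.  Suppose $\Theta_0$ is non-empty. Suppose there exists $\underline{\theta}_0 \in \Theta_0$ such that $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi)$ is upper semicontinuous at $\underline{\theta}_0$ for all $\varphi \in \mathcal{D}_{P^*}$ and all $i = 1, \ldots, k$. Then the minimax value with respect to S is $kP^*$ and the minimax value with respect to S' is $(k-1)P^*$.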
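
Proof.  Let $\pi_{(1)}, \ldots, \pi_{(k-1)}$ denote the k-1 non-best populations and $\pi_{(k)}$ the best population at $\underline{\theta}_0$. Then the risks at $\underline{\theta}_0$ are

(3.2)  $E_{\underline{\theta}_0}(S \mid \varphi) = \sum_{i=1}^{k} P_{\underline{\theta}_0}(\text{select } \pi_{(i)} \mid \varphi)$

(3.3)  $E_{\underline{\theta}_0}(S' \mid \varphi) = \sum_{i=1}^{k-1} P_{\underline{\theta}_0}(\text{select } \pi_{(i)} \mid \varphi)$.

The "no data rule" defined by $\varphi_i^*(\underline{x}) = P^*$, $i = 1, \ldots, k$, has $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi^*) = P^*$ for all $\underline{\theta}$ and all i. So $E_{\underline{\theta}}(S \mid \varphi^*) = kP^*$ and $E_{\underline{\theta}}(S' \mid \varphi^*) = (k-1)P^*$ for all $\underline{\theta}$, and the minimax values can be no greater than $kP^*$ and $(k-1)P^*$ respectively.

On the other hand, let $\Theta_{(i)}$ be the subset of $\Theta$ where $\pi_{(i)}$ is best. Since $\underline{\theta}_0 \in \bar{\Theta}_{(i)}$ and $P_{\underline{\theta}}(\text{select } \pi_{(i)} \mid \varphi)$ is upper semicontinuous at $\underline{\theta}_0$,

$P_{\underline{\theta}_0}(\text{select } \pi_{(i)} \mid \varphi) \ge \inf_{\Theta_{(i)}} P_{\underline{\theta}}(\text{select } \pi_{(i)} \mid \varphi) = \inf_{\Theta_{(i)}} P_{\underline{\theta}}(CS \mid \varphi) \ge P^*$

for any $\varphi \in \mathcal{D}_{P^*}$. So

(3.4)  $\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi) \ge E_{\underline{\theta}_0}(S \mid \varphi) \ge kP^*$

       $\sup_{\Theta} E_{\underline{\theta}}(S' \mid \varphi) \ge E_{\underline{\theta}_0}(S' \mid \varphi) \ge (k-1)P^*$

for any $\varphi \in \mathcal{D}_{P^*}$. Thus the minimax values can be no less than $kP^*$ and $(k-1)P^*$ respectively. ||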
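The upper bound in the proof rests on the "no data rule", which is specified only through its individual selection probabilities. As a minimal sketch, the code below builds one possible randomization with those probabilities; this particular construction (select everything with probability `p_all`, otherwise a single population chosen uniformly) is an assumption made here for illustration and is not from the source. It also shows where the requirement 1/k < P* < 1 is used.

```python
from fractions import Fraction

def no_data_rule(k, p_star):
    """One illustrative randomization with phi_i = P* for every i.

    With probability p_all select all k populations; otherwise select one
    population chosen uniformly at random. Valid only when 1/k < P* < 1.
    """
    p_star = Fraction(p_star)
    if not Fraction(1, k) < p_star < 1:
        raise ValueError("need 1/k < P* < 1")
    p_all = (k * p_star - 1) / (k - 1)
    phi = [p_all + (1 - p_all) / k for _ in range(k)]
    expected_size = k * p_all + (1 - p_all)
    return p_all, phi, expected_size

k, p_star = 4, Fraction(9, 10)
p_all, phi, expected_size = no_data_rule(k, p_star)
# Dropping the best population's term gives the expected number of non-best selections.
expected_non_best = expected_size - phi[0]
print(p_all, phi, expected_size, expected_non_best)
```

The rule ignores the data, so these values hold at every parameter point, giving risks of kP* and (k-1)P* for the two criteria.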

Remark 3.1.  The hypothesis that $\Theta_0$ is non-empty is usually satisfied. If $\Theta = I \times I \times \cdots \times I$ (k times) where I is an interval on the real line and if the best is defined in terms of the largest or smallest coordinate of the parameter, then $\Theta_0 = \{\underline{\theta} = (\theta, \theta, \ldots, \theta) : \theta \in I\}$. If $\underline{X}$ has a multinomial distribution, $\Theta = \{(\theta_1, \ldots, \theta_k) : \theta_i \ge 0, \sum_{i=1}^{k} \theta_i = 1\}$. If the best population is the one associated with the largest or smallest coordinate of the parameter, then $\Theta_0$ is the single point $(1/k, \ldots, 1/k)$. It should be noted that in both of these examples, the determination of $\Theta_0$ did not depend on which population was tagged as best in those cases where two or more of the coordinates were equal and largest (or smallest). It may be argued that in problems like the above, any action is acceptable to the experimenter if $\underline{\theta} \in \Theta_0$. In this case, one would set $R(\underline{\theta}, \varphi) = 0$ for $\underline{\theta} \in \Theta_0$. But, even allowing this, Theorem 3.1 remains true in the usual (see Remark 3.2) case where $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi)$ is continuous in $\underline{\theta}$, for (3.4) can be replaced by

$\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi) \ge \lim_{\underline{\theta} \to \underline{\theta}_0} E_{\underline{\theta}}(S \mid \varphi) \ge kP^*$

and similarly for S'.
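Remark 3.2.  The upper semicontinuity assumption of Theorem 3.1 is much less formidable than it appears. For example, Chung (1970) (problem 10, page 100) can be generalized to state that if $\underline{X}$ has a density $f(\underline{x}; \underline{\theta})$ with respect to a sigma finite measure $\mu$ and if $f(\underline{x}; \underline{\theta})$ is continuous at $\underline{\theta}_0$ (as a function of $\underline{\theta}$) for almost all ($\mu$) $\underline{x}$, then $E_{\underline{\theta}}\, \psi(\underline{X})$ is continuous at $\underline{\theta}_0$ for any bounded $\psi$. Since $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi) = E_{\underline{\theta}}\, \varphi_i(\underline{X})$ and $0 \le \varphi_i \le 1$, this shows that $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi)$ will be a continuous function of $\underline{\theta}$ on $\Theta$ for any $\varphi$ in any problem with densities which are (almost everywhere) continuous in the parameter.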

Theorem  3.1  indicates  a relationship  between  minimaxity  with  respect  to 
S and  S'.  Theorem  3.2  shows  that  minimaxity  with  respect  to  S'  is  more  easily 
achieved  than  minimaxity  with  respect  to  S. 


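Theorem 3.2.  Under the assumptions of Theorem 3.1, if $\varphi \in \mathcal{D}_{P^*}$ is minimax with respect to S, then $\varphi$ is minimax with respect to S'.

Proof.

$\sup_{\Theta} E_{\underline{\theta}}(S' \mid \varphi) = \sup_{\Theta} \{E_{\underline{\theta}}(S \mid \varphi) - P_{\underline{\theta}}(CS \mid \varphi)\}$

$\le \sup_{\Theta} \{E_{\underline{\theta}}(S \mid \varphi) - P^*\}$

$= kP^* - P^* = (k-1)P^*$.

By Theorem 3.1 this upper bound equals the minimax value with respect to S'. ||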

Theorems  3.1  and  3.2  can  be  used  to  show  that  in  location  and  scale 
parameter  problems,  two  rules  proposed  by  Gupta  (1965)  are  minimax.  In  the 
following,  we  will  consider  the  case  in  which  the  population  associated  with 
the  largest  parameter  value  is  best.  With  the  appropriate  modifications, 
analogous  results  could  be  obtained  if  the  population  associated  with  the 
smallest  parameter  value  is  best. 

Gupta (1965) proposed and studied the following two rules. For a location parameter problem, define the rule $R_1$ by

(3.5)  $R_1$: select $\pi_i$ if $x_i \ge \max_{1 \le j \le k} x_j - d$,  $i = 1, \ldots, k$,

where d > 0 is the smallest constant such that the $P^*$-condition is satisfied.

For a scale parameter problem, define the rule $R_2$ by

(3.6)  $R_2$: select $\pi_i$ if $x_i \ge c \cdot \max_{1 \le j \le k} x_j$,  $i = 1, \ldots, k$,

where 0 < c < 1 is the largest constant such that the $P^*$-condition is satisfied.
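The two rules (3.5) and (3.6) are simple enough to state as code. The sketch below is only an illustration: the function names are made up here, and the constants `d` and `c` in the usage lines are arbitrary values, not constants calibrated to any P*-condition.

```python
def gupta_location_rule(x, d):
    """Rule (3.5): select population i when x[i] >= max(x) - d, with d > 0."""
    top = max(x)
    return [i for i, xi in enumerate(x, start=1) if xi >= top - d]

def gupta_scale_rule(x, c):
    """Rule (3.6): select population i when x[i] >= c * max(x), with 0 < c < 1."""
    top = max(x)
    return [i for i, xi in enumerate(x, start=1) if xi >= c * top]

# Hypothetical observations, one per population (positive, so both rules apply).
sample = [1.2, 0.4, 1.9, 1.5]
location_choice = gupta_location_rule(sample, d=0.5)
scale_choice = gupta_scale_rule(sample, c=0.6)
print(location_choice, scale_choice)
```

Both rules always include the population with the largest observation, so the selected subset is never empty.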

Theorem 3.3.  Suppose $X_1, \ldots, X_k$ are independent. Suppose $\theta_i$ is a location (scale) parameter and $X_i$ has density $f_{\theta_i}(x_i) = f(x_i - \theta_i)$ ($f(x_i/\theta_i)/\theta_i$) with respect to Lebesgue measure, $\mu$, on the real line ($(0, \infty)$). Suppose $f_\theta(x)$ has monotone likelihood ratio. Then $R_1$ ($R_2$) is minimax with respect to S and S'.

Proof.  Gupta (1965) proved that under the assumptions of independence and monotone likelihood ratio,

$\sup_{\Theta} E_{\underline{\theta}}(S \mid R_1 (R_2)) = \sup_{\Theta_0} E_{\underline{\theta}}(S \mid R_1 (R_2)) = kP^*$.

The continuity assumption of Theorem 3.1 is satisfied for any location (scale) parameter density with respect to Lebesgue measure (see Royden (1968), problem 17, chapter 4). The result follows from Theorems 3.1 and 3.2. ||

Theorem 3.3 generalizes a result of Gupta and Studden (1966). They proved that $R_1$ ($R_2$) is minimax among all permutation invariant rules in $\mathcal{D}_{P^*}$. Theorem 3.3 proves minimaxity among all rules in $\mathcal{D}_{P^*}$.
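A rough Monte Carlo check of the location-parameter case can make the value kP* concrete. The sketch below assumes independent unit-variance normal observations and assumes that the P*-condition for the rule (3.5) binds at equal means, so that d can be estimated as the P*-quantile of the largest difference between the other observations and a fixed one. The function names, sample sizes and seed are choices made here, and the estimates carry simulation error.

```python
import random

def calibrate_d(k, p_star, n_sim, rng):
    """Estimate d as the P*-quantile of max_{j != k}(X_j - X_k) under equal means."""
    gaps = []
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(k)]
        gaps.append(max(x[:-1]) - x[-1])
    gaps.sort()
    return gaps[min(n_sim - 1, int(p_star * n_sim))]

def expected_subset_size(theta, d, n_sim, rng):
    """Monte Carlo estimate of E(S) for rule (3.5) at the mean vector theta."""
    total = 0
    for _ in range(n_sim):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        top = max(x)
        total += sum(1 for xi in x if xi >= top - d)
    return total / n_sim

rng = random.Random(0)
k, p_star = 3, 0.9
d = calibrate_d(k, p_star, 20000, rng)
size_equal = expected_subset_size([0.0] * k, d, 20000, rng)
size_spread = expected_subset_size([0.0, 0.0, 3.0], d, 20000, rng)
print(d, size_equal, size_spread)
```

Under these assumptions the estimate at equal means should be close to kP* = 2.7, while a configuration with one clearly larger mean gives a smaller expected subset size, in line with the supremum being attained on the set of equal parameters.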

4.  NECESSARY CONDITIONS FOR MINIMAXITY.  Any minimax selection rule must satisfy certain equalities on the set $\Theta_0$. These necessary conditions are principally of use in proving that certain rules, in violating these conditions, are not minimax. Theorem 4.1 provides the necessary conditions for minimaxity with respect to S and Theorem 4.2 the analogous conditions for S'.


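Theorem 4.1.  Let $\varphi$ be a minimax rule with respect to S. Suppose $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi)$ is upper semicontinuous for all $i = 1, \ldots, k$ at $\underline{\theta}_0 \in \Theta_0$. Then

a)  $P_{\underline{\theta}_0}(\text{select } \pi_i \mid \varphi) = P^* = \inf_{\Theta} P_{\underline{\theta}}(CS \mid \varphi)$ for all $i = 1, \ldots, k$

b)  $P_{\underline{\theta}_0}(CS \mid \varphi) = P^* = \inf_{\Theta} P_{\underline{\theta}}(CS \mid \varphi)$

c)  $E_{\underline{\theta}_0}(S \mid \varphi) = kP^* = \sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi)$.

Remark 4.1.  Condition (a) of Theorem 4.1 implies condition (b) and the first equality in (c) as well as (a) and the first equality in (b) of Theorem 4.2. If one wishes to verify these conditions for a given rule to check if it might be minimax, only 4.1(a) need be verified.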


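Proof.  As in the proof of Theorem 3.1, it follows that

(4.1)  $P_{\underline{\theta}_0}(\text{select } \pi_i \mid \varphi) \ge P^*$ for all $i = 1, \ldots, k$.

By considering the "no data rule" $\varphi_i^*(\underline{x}) = P^*$, it follows that the minimax value is no greater than $kP^*$ so, since $\varphi$ is minimax and (4.1) is true,

$kP^* \ge \sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi) \ge E_{\underline{\theta}_0}(S \mid \varphi) = \sum_{i=1}^{k} P_{\underline{\theta}_0}(\text{select } \pi_i \mid \varphi) \ge kP^*$.

All the inequalities are equalities and (a) and (c) are true. (b) follows from (a) since $P_{\underline{\theta}_0}(CS \mid \varphi) = P_{\underline{\theta}_0}(\text{select } \pi_i \mid \varphi)$ where $\underline{\theta}_0 \in \Theta_i$. ||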

Nagel (1970) found that a condition related to 4.1(b), viz.,

$\inf_{\Theta} P_{\underline{\theta}}(CS \mid \varphi) = \inf_{\Theta_0} P_{\underline{\theta}}(CS \mid \varphi)$,

was an important property of just selection rules. Conditions 4.1(a) and (b) have long been recognized (cf. Gupta and Studden (1966)) as intuitively appealing properties of selection rules. This is especially true for those problems in which $\Theta_0$ consists of those parameter points at which one of the k populations has arbitrarily been tagged as best, e.g., a location or scale parameter problem in which best has been defined in terms of the largest or smallest parameter. Theorem 4.1 verifies that, in terms of minimaxity considerations, the intuition is justified.

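Theorem 4.2.  Let $\varphi$ be a minimax rule with respect to S'. Suppose $P_{\underline{\theta}}(\text{select } \pi_i \mid \varphi)$ is upper semicontinuous for all $i = 1, \ldots, k$ at $\underline{\theta}_0 \in \Theta_0$. Then

a)  $P_{\underline{\theta}_0}(\text{select } \pi_i \mid \varphi) = P^* = \inf_{\Theta} P_{\underline{\theta}}(CS \mid \varphi)$ for all $i = 1, \ldots, k$, $i \ne j$, where $\underline{\theta}_0 \in \Theta_j$

b)  $E_{\underline{\theta}_0}(S' \mid \varphi) = (k-1)P^* = \sup_{\Theta} E_{\underline{\theta}}(S' \mid \varphi)$.

Proof.  Similar to Theorem 4.1. ||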

Remark 4.2.  For those problems in which the random variables are exchangeable for $\underline{\theta} \in \Theta_0$ and the rule $\varphi$ is invariant under permutations (symmetric), the following is true for any $\underline{\theta} \in \Theta_0$:

$P_{\underline{\theta}}(\text{select } \pi_1 \mid \varphi) = P_{\underline{\theta}}(\text{select } \pi_2 \mid \varphi) = \cdots = P_{\underline{\theta}}(\text{select } \pi_k \mid \varphi)$.


Example 4.2.  Consider the binomial selection problem in which $X_1, \ldots, X_k$ are independent binomial random variables with success probabilities $\theta_1, \ldots, \theta_k$. $\Theta_0 = \{(\theta_1, \ldots, \theta_k) : \theta_1 = \cdots = \theta_k = \theta, 0 \le \theta \le 1\}$. Gupta and Sobel (1960) proposed using the rule $R_1$ (see (3.5)) to select a subset including the population associated with the largest $\theta_i$. These authors realized that $E_{\underline{\theta}}(S \mid R_1)$ was not constant on $\Theta_0$, as required by 4.1(c) if $R_1$ were to be minimax. Indeed, $E_{\underline{\theta}}(S \mid R_1) \to k$ as $\underline{\theta} \to (1, \ldots, 1)$ and $E_{\underline{\theta}}(S' \mid R_1) \to (k-1)$. The same is true for the arcsin transformation proposed by these authors.
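The lack of constancy on the set of equal success probabilities can be checked by exact enumeration. The sketch below assumes equal sample sizes n and an integer constant d chosen arbitrarily for illustration (it is not calibrated to a P*-condition); the function names are introduced here. By exchangeability, the expected subset size at a point with all probabilities equal to p is k times the probability that the first population is selected.

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def expected_size_R1(k, n, p, d):
    """Exact E(S) for rule (3.5) at theta = (p, ..., p), integer d >= 0."""
    pmf = [binom_pmf(n, p, x) for x in range(n + 1)]
    cdf, running = [], 0.0
    for q in pmf:
        running += q
        cdf.append(running)
    # P(select population 1) = sum_x P(X_1 = x) * P(every other X_j <= x + d)
    p_select = sum(pmf[x] * cdf[min(n, x + d)] ** (k - 1) for x in range(n + 1))
    return k * p_select

k, n, d = 3, 10, 1
sizes = {p: expected_size_R1(k, n, p, d) for p in (0.5, 0.9, 1.0)}
print(sizes)
```

With these illustrative values the expected subset size changes with p and equals k at p = 1, where all counts coincide and every population is selected.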

Example 4.3.  The following general problem has been considered by Gupta and Panchapakesan (1972). Suppose $\pi_1, \ldots, \pi_k$ are independent populations with absolutely continuous distributions $F_{\theta_i}(x_i)$ where $\theta_i \in I$ (an interval on the real line). The family $\{F_\theta : \theta \in I\}$ is assumed to be stochastically increasing in $\theta$. Gupta and Panchapakesan investigated a class of procedures for selecting a subset containing the population associated with the largest $\theta_i$ defined by:

(4.2)  $R_h$: select $\pi_i$ iff $h(x_i) \ge \max_{1 \le j \le k} x_j$

where h is a real valued function satisfying certain regularity conditions. $\Theta_0 = \{(\theta, \ldots, \theta) : \theta \in I\}$. For any $\underline{\theta} = (\theta, \ldots, \theta) \in \Theta_0$ and $i = 1, \ldots, k$,

(4.3)  $P_{\underline{\theta}}(\text{select } \pi_i \mid R_h) = \int F_\theta^{k-1}(h(x))\, dF_\theta(x)$.

By Theorems 4.1 and 4.2, if the procedure $R_h$ is to be minimax with respect to S or S', (4.3) must be constant on $\Theta_0$. But Gupta and Panchapakesan have found conditions under which (4.3) will be strictly increasing in $\theta$. Gupta and Studden (1970) have established the strict monotonicity of (4.3) for the non-central $\chi^2$ and non-central F distributions when $R_h$ is $R_2$ (see (3.6)). This is of interest in the problem of selection in terms of Mahalanobis distance for multivariate normal distributions. Gupta and Panchapakesan (1969) have established the strict monotonicity of (4.3) in the problem of selection in terms of the largest (or smallest) multiple correlation coefficient when $R_h$ is $R_2$ (or an analogous rule). Both the conditional and unconditional cases were considered as well as two different statistics, $R^2$, the sample multiple correlation coefficient, and $R^{*2} = R^2/(1 - R^2)$. In violating Theorems 4.1(a) and 4.2(a), none of the above rules is minimax with respect to S or S'.


Remark 4.3.  The fact that the above rules are not minimax was previously reported in some cases. But the interesting point is that one need not always examine $E_{\underline{\theta}}(S \mid \varphi)$ or $E_{\underline{\theta}}(S' \mid \varphi)$ to determine that a rule is not minimax. Often in investigating the least favorable configuration, i.e., that $\underline{\theta}_0$ for which $P_{\underline{\theta}_0}(CS \mid \varphi) = \inf_{\Theta} P_{\underline{\theta}}(CS \mid \varphi)$, one can reduce the problem to investigating $\inf_{\Theta_0} P_{\underline{\theta}}(CS \mid \varphi)$. This, for example, is the case for just rules as defined by Nagel (1970) and Gupta and Nagel (1971). If one finds that $P_{\underline{\theta}}(CS \mid \varphi)$ is not constant on $\Theta_0$ (and some mild continuity assumptions are satisfied), then the rule is not minimax. Thus, the only analysis required to show that a proposed rule is not minimax may be the analysis used to find the least favorable configuration.


5.  MINIMAXITY CONSIDERATIONS FOR SEAL'S CLASS.  Seal (1955) proposed a class of rules for the location parameter problem. The rules were proposed for the independent normal populations problem but might reasonably be used in any location problem. In this section, a lower bound is obtained for $\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi)$ and $\sup_{\Theta} E_{\underline{\theta}}(S' \mid \varphi)$ for rules in this class. This lower bound is then used to prove that, in certain cases, the rules in this class are not minimax.

Definition 5.1.  Let $\mathcal{S}$ denote the class of selection rules having the form:

$\varphi_a$: select $\pi_i$ iff $x_i \ge \sum_{j=1}^{k-1} a_j x_{[j]} - d$

where $x_{[1]} \le \cdots \le x_{[k-1]}$ are the ordered observations excluding $x_i$, the $a_j$ are non-negative constants with $\sum_{j=1}^{k-1} a_j = 1$, and d is the smallest positive constant for which the $P^*$-condition is satisfied.

$R_1$ (see (3.5)) is in $\mathcal{S}$ and corresponds to setting $a_{k-1} = 1$, $a_j = 0$, $j = 1, \ldots, k-2$. Comparisons between $E_{\underline{\theta}}(S \mid R_1)$ and $E_{\underline{\theta}}(S \mid \varphi)$, for certain other rules $\varphi$, have previously been made by Seal (1957) and Deely and Gupta (1968). These authors considered specific parameter configurations (e.g., slippage configurations) and specific alternatives to $R_1$. In the following results, the sup over all parameter configurations and all rules in $\mathcal{S}$ are considered. But, as have the previous authors' works, these results shed some favorable light on $R_1$.

Throughout this section, it will be assumed that $\Theta = \mathbb{R}^k$. The c.d.f. of $\underline{X}$ is $F(\underline{x} - \underline{\theta})$. The following notation will be used. $\theta_{[1]} \le \cdots \le \theta_{[k]}$ will denote the ordered coordinates of $\underline{\theta} = (\theta_1, \ldots, \theta_k)$ so that the best population is the (unknown) one associated with $\theta_{[k]}$. Sometimes, a sequence of parameter points $\langle \underline{\theta}_n \rangle$ will be considered, in which case $\theta_{n[1]} \le \cdots \le \theta_{n[k]}$ will denote the ordered coordinates of $\underline{\theta}_n = (\theta_{n1}, \ldots, \theta_{nk})$.

Theorem 5.1 will be used to obtain a lower bound on the expected subset size. As stated, it also points out an intuitively undesirable property of all rules in $\mathcal{S}$, except $R_1$, namely, there exist parameter points such that $\theta_{[k]} - \theta_{[k-1]}$ is arbitrarily large but the probability of including the population associated with $\theta_{[k-1]}$ in the selected subset is arbitrarily near one.

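Theorem 5.1.  Let $\varphi_a \in \mathcal{S} \setminus \{R_1\}$, where $\mathcal{S}$ is Seal's class of Definition 5.1. Let $r = \min\{i : a_i > 0\}$. Then there exists a sequence of parameter points $\langle \underline{\theta}_n \rangle$ and a subset $K \subset \{1, \ldots, k\}$ of size k-r-1 such that for $i \in K$, $\lim_{n \to \infty} (\theta_{n[k]} - \theta_{ni}) = \infty$ and $\lim_{n \to \infty} P_{\underline{\theta}_n}(\text{select } \pi_i \mid \varphi_a) = 1$.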

Remark 5.1.  For $\varphi_a = R_1$, k-r-1 = 0 so the theorem is vacuously true for $R_1$ also. For any $\varphi_a \ne R_1$, $r \le k-2$ so K will be non-empty.
Proof.  Let $S_i = \{\underline{x} : x_i \ge \sum_{m=1}^{k-1} a_m x_{[m]} - d\}$ be the selection region for $\pi_i$ using $\varphi_a$. Define a sequence of subsets of $\mathcal{X}$ by

(5.1)  $A_n = \{\underline{x} : 2n > x_k > n;\ n > x_j > -d,\ j = r+1, r+2, \ldots, k-1;\ c_n \ge x_i,\ i = 1, 2, \ldots, r\}$

where $c_n = (-n - 2n\, a_{k-1})/a_r$.

Let $K = \{r+1, \ldots, k-1\}$. First it will be shown that $A_n \subset S_j$ for all $j \in K$, for all large n. Since $a_{k-1} \ge 0$ and $a_r > 0$, $c_n \le -n/a_r < -d$ for all large n. Fix such an n and $j \in K$. Let $\underline{x} \in A_n$. Then $x_{[k-1]} = x_k$, $\{x_{[r+1]}, \ldots, x_{[k-2]}\} = \{x_{r+1}, \ldots, x_{k-1}\} \setminus \{x_j\}$ (this set is empty if r = k-2) and $\{x_{[1]}, \ldots, x_{[r]}\} = \{x_1, \ldots, x_r\}$. Using these facts and (5.1), (5.2) and (5.3) are obvious.

(5.2)  $a_{k-1} x_{[k-1]} + a_r x_{[r]} \le a_{k-1} \cdot 2n + a_r c_n = -n$

(5.3)  $\sum_{m=r+1}^{k-2} a_m x_{[m]} \le \max\{x_{[r+1]}, \ldots, x_{[k-2]}\} \le n$

Using (5.2), (5.3) and the fact that $a_m = 0$, $m = 1, \ldots, r-1$, it follows that

(5.4)  $\sum_{m=1}^{k-1} a_m x_{[m]} - d = \sum_{m=r}^{k-1} a_m x_{[m]} - d \le -n + n - d = -d$.

But $x_j > -d$ by (5.1) so $\underline{x} \in S_j$. This is true for any $\underline{x} \in A_n$ so $A_n \subset S_j$ for all $j \in K$.

Define a sequence of parameter points $\underline{\theta}_n = (\theta_{n1}, \ldots, \theta_{nk})$ by

(5.5)  $\theta_{nj} = \begin{cases} 3n/2, & j = k \\ n/2, & j \in K \\ c_n - n, & j = 1, \ldots, r \end{cases}$

For any $j \in K$, $\lim_{n \to \infty} (\theta_{n[k]} - \theta_{nj}) = \lim_{n \to \infty} (3n/2 - n/2) = \infty$.

$P_{\underline{\theta}_n}(A_n) = P_{\underline{0}}(A_n - \underline{\theta}_n)$

(5.6)  $= P(n/2 > Y_k > -n/2;\ n/2 > Y_j > -n/2 - d,\ j = r+1, \ldots, k-1;\ n \ge Y_i > -\infty,\ i = 1, \ldots, r)$

where $\underline{Y} = (Y_1, \ldots, Y_k)$ has c.d.f. $F(\underline{y})$. (5.6) converges to 1 as $n \to \infty$ since all the limits converge to $\infty$ or $-\infty$ as appropriate. Since $A_n \subset S_j$ for all $j \in K$,

(5.7)  $\lim_{n \to \infty} P_{\underline{\theta}_n}(\text{select } \pi_j \mid \varphi_a) = \lim_{n \to \infty} P_{\underline{\theta}_n}(S_j) \ge \lim_{n \to \infty} P_{\underline{\theta}_n}(A_n) = 1$

for all $j \in K$. ||
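The construction in the proof can be traced on a small case. The sketch below implements the rule of Definition 5.1 and the parameter point (5.5), then applies the rule to a noise-free observation equal to that parameter point, which only indicates where the bulk of the distribution sits and is not a probability calculation. The weights (1/2, 1/2), the value d = 1 and n = 10 are arbitrary choices made here, and d is not calibrated to a P*-condition.

```python
def seal_rule(x, a, d):
    """Select population i when x_i >= sum_j a[j] * x_[j] - d, where
    x_[1] <= ... <= x_[k-1] are the other k-1 observations in sorted order."""
    selected = []
    for i, xi in enumerate(x):
        others = sorted(x[:i] + x[i + 1:])
        threshold = sum(aj * xo for aj, xo in zip(a, others)) - d
        if xi >= threshold:
            selected.append(i + 1)
    return selected

def theta_n(n, a):
    """Parameter point (5.5) for weights a = (a_1, ..., a_{k-1})."""
    k = len(a) + 1
    r = min(j for j, aj in enumerate(a, start=1) if aj > 0)
    c_n = (-n - 2 * n * a[-1]) / a[r - 1]
    return [c_n - n] * r + [n / 2] * (k - r - 1) + [3 * n / 2]

a_mean = [0.5, 0.5]   # averages the other two observations (k = 3, r = 1)
a_max = [0.0, 1.0]    # weights that reduce to the rule (3.5)
theta = theta_n(10, a_mean)
picked_mean = seal_rule(theta, a_mean, d=1.0)
picked_max = seal_rule(theta, a_max, d=1.0)
print(theta, picked_mean, picked_max)
```

At this point the averaging rule includes the second population even though its parameter lies 10 units below the best one, because the very small first coordinate drags the threshold down; the rule based on the maximum alone keeps only the best population.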


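Theorem 5.2.  Let $\varphi_a \in \mathcal{S}$, where $\mathcal{S}$ is Seal's class of Definition 5.1. Let $r = \min\{i : a_i > 0\}$. Then

a)  $\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi_a) \ge k - r$

b)  $\sup_{\Theta} E_{\underline{\theta}}(S' \mid \varphi_a) \ge k - r - 1$.

Proof.  If $\varphi_a = R_1$, k-r = 1 and k-r-1 = 0 so (a) and (b) are obviously true.

For any $\varphi_a \in \mathcal{S} \setminus \{R_1\}$, using the notation defined in the proof of Theorem 5.1 we have

$\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi_a) \ge \lim_{n \to \infty} E_{\underline{\theta}_n}(S \mid \varphi_a)$

(5.8)  $\ge \lim_{n \to \infty} \sum_{m=r+1}^{k} P_{\underline{\theta}_n}(\text{select } \pi_m \mid \varphi_a)$.

Theorem 5.1 proved the first k-r-1 terms converge to one in the limit. For every $\underline{x} \in A_n$, $x_k$ is the largest coordinate so $A_n \subset S_k$ for every n. Thus (5.7) holds with j = k. Hence the bound k-r for (a).

From (5.5), $\pi_k$ is the best population for all $\underline{\theta}_n$. Thus using the same reasoning as above, excluding the term $P_{\underline{\theta}_n}(\text{select } \pi_k \mid \varphi_a)$ in (5.8), yields the bound k-r-1 for (b). ||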


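Corollary 5.1.  Let $\varphi_a \in \mathcal{S} \setminus \{R_1\}$, where $\mathcal{S}$ is Seal's class of Definition 5.1. Let $r = \min\{i : a_i > 0\}$. Then

(a)  if $P^* < (k-r)/k$, $\varphi_a$ is not minimax with respect to S

(b)  if $P^* < (k-r-1)/(k-1)$, $\varphi_a$ is not minimax with respect to S'.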

Proof.  The "no data rule", $\varphi_i^*(\underline{x}) = P^*$, has $\sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi^*) = kP^* < k - r \le \sup_{\Theta} E_{\underline{\theta}}(S \mid \varphi_a)$. Hence (a) is true. (b) is analogous. ||

Corollary 5.2.  (a) If $P^* < 2/k$, no rule in $\mathcal{S} \setminus \{R_1\}$ is minimax with respect to S. (b) If $P^* < 1/(k-1)$, no rule in $\mathcal{S} \setminus \{R_1\}$ is minimax with respect to S'.

Proof.  Any rule in $\mathcal{S} \setminus \{R_1\}$ has $r \le k-2$. So the smallest upper bound in Corollary 5.1(a) is 2/k. Hence (a) is true. (b) is analogous. ||
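The thresholds in Corollaries 5.1 and 5.2 are simple fractions of k and r, and tabulating them for a small k shows how they shrink as r grows. The helper below is a sketch with a name chosen here; it only evaluates the two bounds and does not decide minimaxity when P* lies above them.

```python
from fractions import Fraction

def non_minimax_thresholds(k, r):
    """Bounds of Corollary 5.1: a rule with first positive weight index r is
    not minimax for S if P* < (k-r)/k, nor for S' if P* < (k-r-1)/(k-1)."""
    if not 1 <= r <= k - 2:
        raise ValueError("need 1 <= r <= k - 2 (rules in the class other than R1)")
    return Fraction(k - r, k), Fraction(k - r - 1, k - 1)

k = 5
table = {r: non_minimax_thresholds(k, r) for r in range(1, k - 1)}
for r, (bound_s, bound_s_prime) in table.items():
    print(r, bound_s, bound_s_prime)
```

The row with r = k-2 gives the smallest thresholds, 2/k and 1/(k-1), which are the values appearing in Corollary 5.2.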